Using Pig as a data preparation language for large-scale mining software repositories studies: An experience report
Abstract
The Mining Software Repositories (MSR) field analyzes software repository data to uncover knowledge and assist the development of ever-growing, complex systems. However, existing approaches and platforms for MSR analysis face many challenges when performing large-scale MSR studies. Such approaches and platforms rarely scale easily out of the box. Instead, they often require custom scaling tricks and designs that are costly to maintain and that are not reusable for other types of analysis. We believe that the web community has faced many of these software engineering scaling challenges before, as web analyses...
Similar resources
An Optimal Model for Medicine Preparation Using Data Mining
Introduction: Lack of financial resources and liquidity are the main problems of hospitals. Pharmacies are one of the sectors that affect the turnover of hospitals and due to lack of forecast for the use and supply of medicines, at the end of the year, encounter over-inventory, large volumes of expired medicines, and sometimes shortage of medicines. Therefore, medicine prediction using availabl...
Software Mining Studies: Goals, Approaches, Artifacts, and Replicability
The mining of software archives has enabled new ways for increasing the productivity in software development: Analyzing software quality, mining project evolution, investigating change patterns and evolution trends, mining models for development processes, developing methods of integrating mined data from various historical sources, or analyzing natural language artifacts in software repositori...
Perform Three Data Mining Tasks with Crowdsourcing Process
For data mining studies, because of the complexity of doing feature selection process in tasks by hand, we need to send some of labeling to the workers with crowdsourcing activities. The process of outsourcing data mining tasks to users is often handled by software systems without enough knowledge of the age or geography of the users' residence. Uncertainty about the performance of virtual user...
Declarative Visitors to Ease Fine-grained Source Code Mining with Full History on Billions of AST Nodes by Robert Dyer, Hridesh Rajan, and Tien N. Nguyen
Software repositories contain a vast wealth of information about software development. Mining these repositories has proven useful for detecting patterns in software development, testing hypotheses for new software engineering approaches, etc. Specifically, mining source code has yielded significant insights into software development artifacts and processes. Unfortunately, mining source code at...
Journal title:
- Journal of Systems and Software
Volume 85, Issue
Pages -
Publication date: 2012